Daniela’s Notes for Nicenboim, Schad, & Vasishth (2024)
  • D. Palleschi

Chapter contents

  • Set up
  • 3 Ch. 1 - Intro
    • 3.1 Probability
      • 3.1.1 Conditional probability and Bayes’ rule
      • 3.1.2 Law of total probability
    • 3.2 Discrete random variables
      • 3.2.1 The mean and variance of the binomial distribution
      • 3.2.2 Compute probability of a particular outcome (discrete): dbinom
      • 3.2.3 Compute cumulative probability: pbinom
      • 3.2.4 Compute the inverse of the CDF (quantile function): qbinom
        • 3.2.4.1 Generate simulated data from a binomial distribution: rbinom
    • 3.3 Continuous random variables
      • 3.3.1 An important distinction: probability vs. density in continuous random variables
      • 3.3.2 Truncating a normal distribution
    • 3.4 Bivariate and multivariate distributions
      • 3.4.0.1 Marginal distributions
      • 3.4.1 Generate simulated bivariate (multivariate) data
    • 3.5 An important concept: the marginal likelihood (integrating out a parameter)
  • 4 Session Info

2  Chapter notes

Author

Daniela Palleschi

These notes are based on Nicenboim, Schad, and Vasishth (2024), both from a PDF version supplied by the authors and the HTML version available at https://vasishth.github.io/bayescogsci/book/ (accessed in early 2023). Much of the notes are taken verbatim from the book, as are code snippets.

Set up

# set global knit options
knitr::opts_chunk$set(echo = T, # print chunks?
                      eval = T, # run chunks?
                      error = F, # print errors?
                      warning = F, # print warnings?
                      message = F, # print messages?
                      cache = T # cache?; be careful with this!
                      )

# suppress scientific notation
options(scipen=999)

# play a sound if error encountered
options(error = function() {beepr::beep(9)})

# load packages
## create list of package names
packages <- c( #"SIN", # this package was removed from the CRAN repository
               "MASS", "dplyr", "tidyr", "purrr", "extraDistr", "ggplot2", "loo", "bridgesampling", "brms", "bayesplot", "tictoc", "hypr", "bcogsci", "papaja", "grid", "kableExtra", "gridExtra", "lme4", "cowplot", "pdftools", "cmdstanr", "rootSolve", "rstan"
  )

# NB: if you haven't already installed bcogsci through devtools, it won't be loaded
## Now load or install & load all
package.check <- lapply(
  packages,
  FUN = function(x) {
    if (!require(x, character.only = TRUE)) {
      install.packages(x, dependencies = TRUE)
      library(x, character.only = TRUE)
    }
  }
)

# this is also required, taken from the textbook

## Save compiled models:
rstan_options(auto_write = FALSE)
## Parallelize the chains using all the cores:
options(mc.cores = parallel::detectCores())
# To solve some conflicts between packages
select <- dplyr::select
extract <- rstan::extract

3 Ch. 1 - Intro

  • given some data, how to use Bayes’ theorem to quantify uncertainty about our belief regarding a scientific question of interest
  • topics to be understood:
    • the basic concepts behind probability
    • the concept of random variables
    • probability distributions
    • the concept of likelihood

3.1 Probability

Frequency-based versus uncertain-belief perspective of probability:

  1. repeatable events, like rolling a die and getting a 6, are frequentist because probability is related to the frequency at which we’d observe an outcome given repeated observations
  2. one-of-a-kind events, like earthquakes, don’t work with this idea of probability
  • the probability of an earthquake expresses our uncertainty about an event happening
  • we can also be uncertain about how probable an event is: e.g., being 90% sure that something is 50% likely to happen
  • this is what we’re interested in: how uncertain we are of an estimate

In Bayesian analysis, we want to express our uncertainty about the probability of observing an outcome (prior distribution).

3.1.1 Conditional probability and Bayes’ rule

  • A = “the streets are wet”
  • B = “it was raining”
  • P(A|B) = the probability of A given B
  • P(A,B) = P(A|B)P(B) (the probability of A and B happening)
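Bayes’ rule inverts the conditioning: P(B|A) = P(A|B)P(B) / P(A). A quick numerical check of these identities with made-up probabilities (the numbers below are hypothetical, not from the book):

```r
# hypothetical probabilities, chosen only to illustrate the identities
p_B   <- 0.2    # P(it was raining)
p_A_B <- 0.9    # P(streets wet | raining)
p_A   <- 0.26   # P(streets wet), assumed known here

# joint probability: P(A, B) = P(A|B) * P(B)
p_AB <- p_A_B * p_B
p_AB            # 0.18

# Bayes' rule: P(B|A) = P(A|B) * P(B) / P(A)
p_B_A <- p_A_B * p_B / p_A
p_B_A           # ~0.69: wet streets raise the probability of rain well above the 0.2 prior
```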

3.1.2 Law of total probability

  • if events B1, …, Bn partition the sample space, then P(A) = P(A|B1)P(B1) + … + P(A|Bn)P(Bn): the marginal probability of A is obtained by summing over (marginalizing out) the conditioning events
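A minimal sketch of the law with a two-event partition, B and not-B (the probabilities are made up, continuing the rain/wet-streets example):

```r
# made-up probabilities for the rain/wet-streets example
p_B      <- 0.2   # P(raining)
p_A_B    <- 0.9   # P(wet | raining)
p_A_notB <- 0.1   # P(wet | not raining)

# law of total probability: sum over the partition {B, not B}
p_A <- p_A_B * p_B + p_A_notB * (1 - p_B)
p_A   # 0.26
```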

3.2 Discrete random variables

Generating random sequences of simulated data with a binomial distribution. Imagine a cloze task, where we consider a particular word a success (1) and any other word a failure (0). If we run the experiment 20 times with a sample size of 10 trials each, the number of successes in each of the 20 simulated experiments would be:

rbinom(n = 20, size = 10, prob = .5)

[1] 7 7 7 5 6 7 6 4 5 3 8 2 2 5 4 1 6 6 3 7

For discrete random variables such as the binomial, the probability distribution p(y|\(\theta\)) is called a probability mass function (PMF). The PMF defines the probability of each possible outcome. With n = 10 trials, there are 11 possible outcomes (0, 1, 2, …, 10 successes). Which outcome is most probable depends on the parameter \(\theta\) that represents the probability of success. Above, we set \(\theta\) to 0.5.
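As a quick check (my addition, not the book’s code), the PMF assigns a probability to each of the 11 outcomes, and those probabilities sum to 1:

```r
# probabilities of k = 0..10 successes in 10 trials with theta = 0.5
pmf <- dbinom(0:10, size = 10, prob = 0.5)
round(pmf, 3)
sum(pmf)   # a valid PMF sums to 1; k = 5 is the most probable outcome here
```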

3.2.1 The mean and variance of the binomial distribution

In real experimental situations we never know the true value of \(\theta\) (the probability of an outcome), but it can be estimated from the data: \(\hat{\theta}\) = k/n, where k = number of observed successes, n = number of trials, and \(\hat{\theta}\) = observed proportion of successes. \(\hat{\theta}\) is the maximum likelihood estimate of the true but unknown parameter \(\theta\). The mean of the binomial distribution is n\(\theta\) and its variance is n\(\theta\)(1 − \(\theta\)); plugging in \(\hat{\theta}\) gives estimates of both. These estimates can be used for statistical inference.
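A short simulation illustrating these estimates (my own sketch; the seed and sample size are arbitrary):

```r
# simulate one experiment: n = 100 trials with true theta = 0.5
set.seed(123)
n <- 100
k <- rbinom(1, size = n, prob = 0.5)   # observed number of successes

theta_hat <- k / n                     # MLE of theta (observed proportion)
mean_hat  <- n * theta_hat             # estimated mean of the binomial
var_hat   <- n * theta_hat * (1 - theta_hat)  # estimated variance

c(theta_hat = theta_hat, mean_hat = mean_hat, var_hat = var_hat)
```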

3.2.2 Compute probability of a particular outcome (discrete): dbinom

dbinom calculates probability of k successes out of n given a particular \(\theta\).

dbinom(5, size = 10, prob = .5)
[1] 0.2460938
dbinom(5, size = 10, prob = .1)
[1] 0.001488035
dbinom(5, size = 10, prob = .9)
[1] 0.001488035

With continuous data, the probability of obtaining an exact value will always be zero. We’ll come to this later.

3.2.3 Compute cumulative probability: pbinom

The cumulative distribution function (CDF): essentially the sum of all probabilities of the values of k you are interested in. E.g., the probability of observing 2 successes or fewer (0, 1, or 2) is:

# sum of probabilities for exact k's
dbinom(0, size = 10, prob = .5) +
  dbinom(1, size = 10, prob = .5) +
  dbinom(2, size = 10, prob = .5)
[1] 0.0546875
# or
sum(dbinom(0:2, size = 10, prob = .5))
[1] 0.0546875
# or use pbinom()
pbinom(2, size = 10, prob = 0.5, lower.tail = TRUE)
[1] 0.0546875
# conversely, what is the probability of observing THREE successes or more?
pbinom(2, size = 10, prob = 0.5, lower.tail = F)
[1] 0.9453125
# or
sum(dbinom(3:10, size = 10, prob = .5))
[1] 0.9453125
# the probability of observing 10 or fewer successes (out of 10 trials)
pbinom(10, size = 10, prob = 0.5, lower.tail = TRUE)
[1] 1

3.2.4 Compute the inverse of the CDF (quantile function): qbinom

The quantile function (the inverse CDF) returns the value of k (the quantile) such that the probability of obtaining k or fewer successes equals a given probability p:

# reverse of dbinom(2,10,.5) would be:
qbinom(0.0546875, size=10, prob=.5)
[1] 2

3.2.4.1 Generate simulated data from a binomial distribution: rbinom

# given 1 iteration of 10 trials where p = .5, produce a random value of k
rbinom(1, 10, .5)
[1] 6

3.3 Continuous random variables

Imagine a vector of reading-time data with a normal distribution, defined by its mean and its sd. The probability density function (PDF) for particular values of mean and sd (assuming a normal distribution) can be calculated using dnorm. The CDF can be found using pnorm, and the inverse CDF using qnorm. These are three different ways of looking at the same information.

# density at 400ms when the true mean is 500 & sd = 100 (PDF)
dnorm(400,mean = 500, sd = 100)
[1] 0.002419707
# p of observing 400ms *or lower* when the true mean is 500 & sd = 100 (CDF)
pnorm(400,mean = 500, sd = 100)
[1] 0.1586553
# k with a CDF of 0.1586553 when the true mean is 500 & sd = 100 (inverse CDF)
qnorm(0.1586553, mean = 500, sd = 100)
[1] 400

Question: what is the probability of observing values between 200 and 700 from a normal distribution where mean = 500 and sd = 100?

pnorm(700,500,100) - pnorm(200,500,100)
[1] 0.9759

With continuous data, it is only meaningful to ask about probabilities between two point values (e.g., probability that Y lies between a and b).

What is the quantile q such that the probability of observing that value or something less (or more) than it is 0.975 (given the normal(500,100) distribution)?

qnorm(0.975, m=500, sd=100)
[1] 695.9964

Next task: generate simulated data. Generate 10 data points using the rnorm function and use the simulated data to compute the mean and standard deviation.

x <- rnorm(10,500,100)
mean(x)
[1] 600.5572
sd(x)
[1] 78.86937
# can also compute lower and upper bounds of a 95% interval (2.5th/97.5th percentiles)
quantile(x, probs = c(.025, .975))
    2.5%    97.5% 
469.3127 720.3811 

3.3.1 An important distinction: probability vs. density in continuous random variables

The probability density function (PDF):

# density with default m = 0 and sd = 1
dnorm(1)
[1] 0.2419707

This is not the probability of observing 1 in this distribution, as the probability of a single value in a continuous distribution will always be 0. This is because probability in a continuous distribution is the area under the curve, and at a single point there is no area under the curve (i.e., p = 0). The pnorm function allows us to find the cumulative distribution function (CDF) for the normal distribution.

For example, the probability of observing a value between +/-2 in a normal distribution with mean 0 and sd 1:

pnorm(2, m = 0, sd = 1) - pnorm(-2, m = 0, sd = 1)
[1] 0.9544997
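The same area can be obtained by numerically integrating the density (a quick cross-check, not from the book):

```r
# integrate the standard normal density from -2 to 2
area <- integrate(dnorm, lower = -2, upper = 2)$value
area                   # ~0.9545, the area under the curve
pnorm(2) - pnorm(-2)   # matches the CDF difference
```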

For discrete random variables, the situation is different. These have a probability mass function (PMF), like the binomial distribution we saw before. Here, the PMF maps the possible y values to the probabilities of those exact values occurring.

dbinom(2,size=10,prob=.5)
[1] 0.04394531

3.3.2 Truncating a normal distribution

Truncating restricts the distribution to part of its range. Reading times, for example, are strictly positive, so a normal distribution can be truncated at 0 so that only positive values are allowed.
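A minimal base-R sketch of sampling from a normal(500, 100) distribution truncated at 0, via inverse-CDF sampling (my own illustration; the book may use dedicated functions such as those in extraDistr instead):

```r
set.seed(1)
m <- 500; s <- 100

# draw uniforms restricted to the CDF mass above the truncation point 0,
# then map them back through the inverse CDF
u <- runif(1000, min = pnorm(0, mean = m, sd = s), max = 1)
x <- qnorm(u, mean = m, sd = s)

min(x)   # every sample is positive
```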

3.4 Bivariate and multivariate distributions

Consider a case where two discrete responses were recorded: a binary yes/no response, and a Likert acceptability rating (1-7).

The joint probability mass function p(x, y) of two discrete random variables gives the probability of each combination of their values.

Let’s play around with some such data:

# run if package is not loaded
# library(bcogsci)
data("df_discreteagrmt")

3.4.0.1 Marginal distributions

The marginal distribution of one variable (let’s say x = the binary response, y = the Likert response) is computed by summing the joint probabilities over the values of the other variable, e.g., summing each row of the joint probability matrix:

rowSums(probs)

NB: the object probs is not defined in the book
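To see what rowSums would be doing, here is a toy joint PMF (made-up probabilities, not the book’s df_discreteagrmt data), with rows for the binary response and columns for a 3-point rating:

```r
# a made-up joint PMF over (binary response, 3-point rating)
probs <- matrix(c(0.10, 0.20, 0.10,
                  0.05, 0.25, 0.30),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("no", "yes"), c("1", "2", "3")))

rowSums(probs)   # marginal PMF of the binary response: 0.40, 0.60
colSums(probs)   # marginal PMF of the rating
sum(probs)       # a valid joint PMF sums to 1
```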

3.4.1 Generate simulated bivariate (multivariate) data

Suppose we want to generate 100 pairs of correlated data, with correlation rho = 0.6. The two random variables have mean 0, and standard deviations 5 and 10 respectively.

## define a variance-covariance matrix:
Sigma <- matrix(c(5^2, 5 * 10 * .6, 5 * 10 * .6, 10^2),
  byrow = FALSE, ncol = 2
)
## generate data:
u <- mvrnorm(
  n = 100,
  mu = c(0, 0),
  Sigma = Sigma
)
head(u, n = 3)
           [,1]      [,2]
[1,]   4.187286  8.903810
[2,] -12.405573 -7.363094
[3,]   2.121652 -6.667865
# plot the data
ggplot(tibble(u_1 = u[, 1], u_2 = u[, 2]), aes(u_1, u_2)) +
  geom_point()
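As a sanity check (my addition, not the book’s code), with a larger sample the empirical correlation and standard deviations should land near the values built into Sigma:

```r
library(MASS)  # for mvrnorm

set.seed(42)
Sigma <- matrix(c(5^2, 5 * 10 * .6, 5 * 10 * .6, 10^2), ncol = 2)
u <- mvrnorm(n = 1e4, mu = c(0, 0), Sigma = Sigma)

cor(u)[1, 2]               # close to the rho = 0.6 we specified
c(sd(u[, 1]), sd(u[, 2]))  # close to 5 and 10
```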

3.5 An important concept: the marginal likelihood (integrating out a parameter)

4 Session Info

Compiled with R version 4.4.0 (2024-04-24) (Puppy Cup) in RStudio version 2023.3.0.386 (Cherry Blossom).

sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: aarch64-apple-darwin20
Running under: macOS Ventura 13.2.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] grid      stats     graphics  grDevices datasets  utils     methods  
[8] base     

other attached packages:
 [1] rstan_2.32.6         StanHeaders_2.32.7   rootSolve_1.8.2.4   
 [4] cmdstanr_0.7.1       pdftools_3.4.0       cowplot_1.1.3       
 [7] lme4_1.1-35.3        Matrix_1.7-0         gridExtra_2.3       
[10] kableExtra_1.4.0     papaja_0.1.2         tinylabels_0.2.4    
[13] bcogsci_0.0.0.9000   hypr_0.2.8           tictoc_1.2.1        
[16] bayesplot_1.11.1     brms_2.21.0          Rcpp_1.0.12         
[19] bridgesampling_1.1-2 loo_2.7.0            ggplot2_3.5.1       
[22] extraDistr_1.10.0    purrr_1.0.2          tidyr_1.3.1         
[25] dplyr_1.1.4          MASS_7.3-60.2       

loaded via a namespace (and not attached):
 [1] Rdpack_2.6           inline_0.3.19        sandwich_3.1-0      
 [4] rlang_1.1.3          magrittr_2.0.3       multcomp_1.4-25     
 [7] matrixStats_1.3.0    compiler_4.4.0       systemfonts_1.0.6   
[10] vctrs_0.6.5          stringr_1.5.1        pkgconfig_2.0.3     
[13] fastmap_1.1.1        backports_1.4.1      labeling_0.4.3      
[16] effectsize_0.8.7     utf8_1.2.4           rmarkdown_2.26      
[19] pracma_2.4.4         ps_1.7.6             nloptr_2.0.3        
[22] xfun_0.43            jsonlite_1.8.8       parallel_4.4.0      
[25] R6_2.5.1             stringi_1.8.4        boot_1.3-30         
[28] estimability_1.5     knitr_1.46           zoo_1.8-12          
[31] parameters_0.21.6    splines_4.4.0        tidyselect_1.2.1    
[34] rstudioapi_0.16.0    abind_1.4-5          yaml_2.3.8          
[37] codetools_0.2-20     processx_3.8.4       pkgbuild_1.4.4      
[40] qpdf_1.3.3           lattice_0.22-6       tibble_3.2.1        
[43] withr_3.0.0          bayestestR_0.13.2    askpass_1.2.0       
[46] posterior_1.5.0      coda_0.19-4.1        evaluate_0.23       
[49] survival_3.5-8       RcppParallel_5.1.7   xml2_1.3.6          
[52] pillar_1.9.0         tensorA_0.36.2.1     checkmate_2.3.1     
[55] renv_0.17.2          stats4_4.4.0         insight_0.19.10     
[58] distributional_0.4.0 generics_0.1.3       rstantools_2.4.0    
[61] munsell_0.5.1        scales_1.3.0         minqa_1.2.6         
[64] xtable_1.8-4         glue_1.7.0           emmeans_1.10.1      
[67] tools_4.4.0          mvtnorm_1.2-4        rbibutils_2.2.16    
[70] QuickJSR_1.1.3       datawizard_0.10.0    colorspace_2.1-0    
[73] nlme_3.1-164         cli_3.6.2            fansi_1.0.6         
[76] viridisLite_0.4.2    svglite_2.1.3        Brobdingnag_1.2-9   
[79] gtable_0.3.5         digest_0.6.35        TH.data_1.1-2       
[82] htmlwidgets_1.6.4    farver_2.1.1         htmltools_0.5.8.1   
[85] lifecycle_1.0.4     
